AITopics | gradient obfuscation

fb4c48608ce8825b558ccf07169a3421-Supplemental.pdf

Neural Information Processing Systems | Apr-27-2026, 22:53:20 GMT

In this section, we perform additional diagnostics that give us confidence that our models do not exhibit any form of gradient obfuscation or masking [3, 53]. We report in Table 4 the robust accuracy obtained by our strongest models against a diverse set of attacks. The cascade is composed as follows: AUTOPGD-CE, an untargeted attack using PGD with an adaptive step on the cross-entropy loss [10]; AUTOPGD-T, a targeted attack using PGD with an adaptive step on the difference of logits ratio [10]; FAB-T, a targeted attack which minimizes the norm of adversarial perturbations [9]; and SQUARE, a query-efficient black-box attack [1]. First, we observe that our combination of attacks, denoted AA+MT, matches the final robust accuracy measured by AUTOATTACK. Second, we notice that the black-box attack (i.e., SQUARE) does not find any additional adversarial examples.
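
How a cascade of attacks yields a single robust-accuracy figure can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes each attack has already produced a per-example flag saying whether it fooled the model (with clean misclassifications counted as successes), and the function name `robust_accuracy` and the toy flags are hypothetical. An example counts as robust only if every attack in the cascade fails on it.

```python
from typing import Dict, List


def robust_accuracy(attack_success: Dict[str, List[bool]]) -> float:
    """Fraction of examples on which no attack in the cascade succeeded."""
    per_attack = list(attack_success.values())
    n = len(per_attack[0])
    assert all(len(flags) == n for flags in per_attack)
    # An example is robust only if every attack failed on it.
    robust = [not any(flags[i] for flags in per_attack) for i in range(n)]
    return sum(robust) / n


# Hypothetical flags for 5 test examples (True = the attack fooled the model).
white_box = {
    "autopgd_ce": [True, False, False, True, False],
    "autopgd_t": [True, False, True, False, False],
    "fab_t": [False, False, True, False, False],
}
# Add a black-box attack that only breaks an example already broken above.
with_black_box = dict(white_box, square=[True, False, False, False, False])

acc_white = robust_accuracy(white_box)
acc_all = robust_accuracy(with_black_box)
```

In this toy data the black-box attack fools only an example that the gradient-based attacks already broke, so both accuracies are equal. A defense that masks gradients would typically show the opposite pattern, with the black-box attack succeeding where gradient-based attacks fail.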

accuracy, artificial intelligence, robust accuracy, (17 more...)

Neural Information Processing Systems

Industry: Transportation > Air (0.55)

Technology: Information Technology > Artificial Intelligence (0.70)

fb4c48608ce8825b558ccf07169a3421-Supplemental.pdf

Neural Information Processing Systems | Feb-12-2026, 00:37:24 GMT

accuracy, augmentation, robust accuracy, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.69)

17256f049f1e3fede17c7a313f7657f4-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 14:47:25 GMT

cortical fixation model, fixation model, iteration, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > Canada (0.04)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.41)

Adversarial Robustness through Local Linearization

Neural Information Processing Systems | Dec-25-2025, 00:36:35 GMT

Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of the model and number of input dimensions increase. Further, training against less expensive and therefore weaker adversaries produces models that are robust against weak attacks but break down under stronger attacks. This is often attributed to the phenomenon of gradient obfuscation; such models have a highly non-linear loss surface in the vicinity of training examples, making it hard for gradient-based attacks to succeed even though adversarial examples still exist. In this work, we introduce a novel regularizer that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness. We show via extensive experiments on CIFAR-10 and ImageNet that models trained with our regularizer avoid gradient obfuscation and can be trained significantly faster than with adversarial training. Using this regularizer, we exceed the current state of the art and achieve 47% adversarial accuracy for ImageNet with L-infinity norm adversarial perturbations of radius 4/255 under an untargeted, strong, white-box attack. Additionally, we match state-of-the-art results for CIFAR-10 at 8/255.
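
One way to make "the loss behaves linearly in the vicinity of the training data" concrete is to measure the gap between the loss and its first-order Taylor expansion inside a small L-infinity ball. The sketch below assumes that measure; the abstract does not spell out the regularizer, so this is an illustration and not the authors' implementation. The toy losses, the finite-difference gradient (a stand-in for automatic differentiation), and the random search over perturbations are all hypothetical choices.

```python
import random
from typing import Callable, List

Vector = List[float]


def numerical_grad(loss: Callable[[Vector], float], x: Vector, h: float = 1e-5) -> Vector:
    """Central finite-difference gradient (stand-in for autodiff)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((loss(up) - loss(down)) / (2 * h))
    return grad


def linearity_violation(loss: Callable[[Vector], float], x: Vector, eps: float,
                        n_samples: int = 200, seed: int = 0) -> float:
    """Largest sampled |loss(x + d) - loss(x) - d . grad| with ||d||_inf <= eps."""
    rng = random.Random(seed)
    base = loss(x)
    grad = numerical_grad(loss, x)
    worst = 0.0
    for _ in range(n_samples):
        delta = [rng.uniform(-eps, eps) for _ in x]
        shifted = [xi + di for xi, di in zip(x, delta)]
        first_order = sum(g * d for g, d in zip(grad, delta))
        worst = max(worst, abs(loss(shifted) - base - first_order))
    return worst


def linear_loss(x: Vector) -> float:
    return 2.0 * x[0] - 3.0 * x[1]


def curved_loss(x: Vector) -> float:
    return 50.0 * (x[0] ** 2 + x[1] ** 2)


x0 = [0.5, -0.25]
eps = 4 / 255
gap_linear = linearity_violation(linear_loss, x0, eps)  # essentially zero
gap_curved = linearity_violation(curved_loss, x0, eps)  # clearly positive
```

A penalty built on such a gap is near zero when the loss is locally linear and grows with curvature, which is the behaviour the abstract describes. A practical version would search for the worst perturbation with gradient steps instead of random sampling.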

adversarial robustness, gradient obfuscation, name change, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.59)

Supplementary Material for Biologically Inspired Mechanisms for Adversarial Robustness

Neural Information Processing Systems | Oct-2-2025, 05:52:49 GMT

Results from these preliminary experiments were not reported in the paper; we report them here in the supplementary materials. The standard bounding boxes were used as provided with the ImageNet dataset. Images with no bounding boxes were discarded for this dataset. See Section 2 in the main paper and consult Bashivan et al. (2019) for full details on the sampling procedure and chosen parameters. Code from Bashivan et al. (2019) was open-sourced at [...] Biological measurements (Gattass et al., 1981, 1988) have demonstrated that in primates, the [...] As described in the paper, we employed two baseline models ('ResNet' and 'coarse fixations') and two [...] The 'ResNet' baseline model directly feeds the full image through a standard ResNet architecture (32x32 for CIFAR10 or 320x320 for ImageNet).

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > Canada (0.04)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.41)

Adversarial Robustness through Local Linearization

Chongli Qin, James Martens, Sven Gowal, Dilip Krishnan, Krishnamurthy Dvijotham, Alhussein Fawzi, Soham De, Robert Stanforth, Pushmeet Kohli

Neural Information Processing Systems | Oct-2-2025, 01:42:56 GMT

Additionally, we match state of the art results for CIFAR-10 at 8/255.

adversarial training, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Adversarial Robustness through Local Linearization

Neural Information Processing Systems | Oct-9-2024, 12:46:42 GMT

Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of the model and number of input dimensions increase. Further, training against less expensive and therefore weaker adversaries produces models that are robust against weak attacks but break down under attacks that are stronger. This is often attributed to the phenomenon of gradient obfuscation; such models have a highly non-linear loss surface in the vicinity of training examples, making it hard for gradient-based attacks to succeed even though adversarial examples still exist. In this work, we introduce a novel regularizer that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness.

adversarial robustness, gradient obfuscation, local linearization, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)

Adversarial Robust Memory-Based Continual Learner

Mi, Xiaoyue, Tang, Fan, Yang, Zonghan, Wang, Danding, Cao, Juan, Li, Peng, Liu, Yang

arXiv.org Artificial Intelligence | Nov-29-2023

Despite the remarkable advances that have been made in continual learning, the adversarial vulnerability of such methods has not been fully discussed. We delve into the adversarial robustness of memory-based continual learning algorithms and observe limited robustness improvement from directly applying adversarial training techniques. Preliminary studies reveal the twin challenges of building adversarially robust continual learners: accelerated forgetting in continual learning and gradient obfuscation in adversarial robustness. In this study, we put forward a novel adversarially robust memory-based continual learner that adjusts data logits to mitigate the forgetting of the past caused by adversarial samples. Furthermore, we devise a gradient-based data selection mechanism to overcome the gradient obfuscation caused by limited stored data. The proposed approach can be widely integrated with existing memory-based continual learning as well as adversarial training algorithms in a plug-and-play way. Extensive experiments on Split-CIFAR10/100 and Split-Tiny-ImageNet demonstrate the effectiveness of our approach, achieving up to 8.13% higher accuracy on adversarial data.

adversarial training, continual learning, robustness, (13 more...)

arXiv.org Artificial Intelligence

2311.17608

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.14)
Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Attacking Adversarial Defences by Smoothing the Loss Landscape

Eustratiadis, Panagiotis, Gouk, Henry, Li, Da, Hospedales, Timothy

arXiv.org Artificial Intelligence | Aug-5-2022

This paper investigates a family of methods for defending against adversarial attacks that owe part of their success to creating a noisy, discontinuous, or otherwise rugged loss landscape that adversaries find difficult to navigate. A common, but not universal, way to achieve this effect is via the use of stochastic neural networks. We show that this is a form of gradient obfuscation, and propose a general extension to gradient-based adversaries based on the Weierstrass transform, which smooths the surface of the loss function and provides more reliable gradient estimates. We further show that the same principle can strengthen gradient-free adversaries. We demonstrate the efficacy of our loss-smoothing method against both stochastic and non-stochastic adversarial defences that exhibit robustness due to this type of obfuscation. Furthermore, we provide an analysis of how it interacts with Expectation over Transformation, a popular gradient-sampling method currently used to attack stochastic defences.
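
The Weierstrass transform is a convolution with a Gaussian, and the gradient of the smoothed loss equals the expected gradient at Gaussian-perturbed inputs. The sketch below illustrates that idea on a one-dimensional toy problem; it is not the paper's attack, and the rugged loss, its constants, the sample count, and the function names are hypothetical.

```python
import math
import random

AMPLITUDE = 0.05   # height of the high-frequency ripple (hypothetical)
FREQUENCY = 200.0  # frequency of the ripple (hypothetical)


def rugged_loss_grad(x: float) -> float:
    """Gradient of the toy loss x**2 + AMPLITUDE * sin(FREQUENCY * x)."""
    return 2.0 * x + AMPLITUDE * FREQUENCY * math.cos(FREQUENCY * x)


def smoothed_grad(x: float, sigma: float, n_samples: int = 4000, seed: int = 0) -> float:
    """Monte Carlo gradient of the Gaussian-smoothed loss: mean of grad(x + noise)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += rugged_loss_grad(x + rng.gauss(0.0, sigma))
    return total / n_samples


# A point where the ripple dominates and flips the sign of the raw gradient.
x0 = 63.0 * math.pi / FREQUENCY
raw = rugged_loss_grad(x0)             # negative: points away from the true trend
smooth = smoothed_grad(x0, sigma=0.1)  # positive: close to the trend 2 * x0
```

At this point the raw gradient points in the wrong direction because of the ripple, while the averaged gradient recovers the underlying trend of the loss. The smoothing width `sigma` trades noise suppression against fidelity to the original landscape.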

adversarial defence, gradient, loss landscape, (16 more...)

arXiv.org Artificial Intelligence

2208.00862

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Second Order Optimization for Adversarial Robustness and Interpretability

Tsiligkaridis, Theodoros, Roberts, Jay

arXiv.org Machine Learning | Sep-10-2020

Deep neural networks are easily fooled by small perturbations known as adversarial attacks. Adversarial Training (AT) is a technique aimed at learning features robust to such attacks and is widely regarded as a very effective defense. However, the computational cost of such training can be prohibitive as the network size and input dimensions grow. Inspired by the relationship between robustness and curvature, we propose a novel regularizer which incorporates first and second order information via a quadratic approximation to the adversarial loss. The worst case quadratic loss is approximated via an iterative scheme. It is shown that using only a single iteration in our regularizer achieves stronger robustness than prior gradient and curvature regularization schemes, avoids gradient obfuscation, and, with additional iterations, achieves strong robustness with significantly lower training time than AT. Further, it retains the interesting facet of AT that networks learn features which are well-aligned with human perception. We demonstrate experimentally that our method produces higher quality human-interpretable features than other geometric regularization techniques. These robust features are then used to provide human-friendly explanations for model predictions.
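
A quadratic approximation estimates the loss increase under a perturbation d as g.d + 0.5 * d.H.d, where g is the input gradient and H the input Hessian. The abstract does not specify the iterative scheme used to find the worst case, so the sketch below substitutes plain projected gradient ascent on an L2 ball. The values of `g`, `H`, the radius, and the step size are hypothetical, and `H` is chosen positive semidefinite so that each ascent step cannot decrease the objective.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(H: Matrix, v: Vector) -> Vector:
    return [sum(h * x for h, x in zip(row, v)) for row in H]


def quad_increase(g: Vector, H: Matrix, d: Vector) -> float:
    """Second-order estimate of loss(x + d) - loss(x): g.d + 0.5 * d.H.d."""
    Hd = mat_vec(H, d)
    linear = sum(gi * di for gi, di in zip(g, d))
    return linear + 0.5 * sum(di * hi for di, hi in zip(d, Hd))


def project_l2(d: Vector, eps: float) -> Vector:
    """Project d onto the L2 ball of radius eps."""
    norm = math.sqrt(sum(di * di for di in d))
    return d if norm <= eps else [di * eps / norm for di in d]


def worst_case_quadratic(g: Vector, H: Matrix, eps: float,
                         steps: int = 100, lr: float = 0.5) -> Vector:
    """Projected gradient ascent on the quadratic model within the L2 ball."""
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    d = [eps * gi / g_norm for gi in g]  # start from the first-order direction
    for _ in range(steps):
        grad = [gi + hi for gi, hi in zip(g, mat_vec(H, d))]
        d = project_l2([di + lr * gr for di, gr in zip(d, grad)], eps)
    return d


g = [0.5, 0.5]                 # hypothetical input gradient
H = [[1.0, 0.0], [0.0, 3.0]]   # hypothetical positive semidefinite Hessian
eps = 1.0

d_first = [eps * gi / math.sqrt(0.5) for gi in g]  # first-order (gradient-direction) step
d_star = worst_case_quadratic(g, H, eps)
q_first = quad_increase(g, H, d_first)
q_star = quad_increase(g, H, d_star)
```

In this toy case the refined perturbation tilts toward the high-curvature direction and predicts a larger loss increase than the gradient-direction step alone, which is the extra information a second-order regularizer can exploit.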

artificial intelligence, machine learning, robustness, (18 more...)

arXiv.org Machine Learning

2009.04923

Country: